OACerts: Oblivious Attribute Certificates (CERIAS Tech Report 2004-11)
We propose Oblivious Attribute Certificates (OACerts), an attribute certificate scheme in which a certificate holder can select which attributes to use and how to use them. In particular, a user can use attribute values stored in an OACert obliviously, i.e., the user obtains a service if and only if the attribute values satisfy the policy of the service provider, yet the service provider learns nothing about these attribute values. To build OACerts, we propose a new cryptographic primitive called Oblivious Commitment Based Envelope (OCBE). In an OCBE scheme, Bob has committed an attribute value to Alice, and Alice runs a protocol with Bob to send an envelope (encrypted message) to Bob such that: (1) Bob can open the envelope if and only if his committed attribute value satisfies a predicate chosen by Alice; (2) Alice learns nothing about Bob's attribute value. We develop provably secure and efficient OCBE protocols for the Pedersen commitment scheme and predicates such as =, ≥, ≤, >, <, ≠ as well as logical combinations of them.
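The Pedersen commitment scheme that the OCBE protocols build on can be sketched in a few lines. The group parameters below are toy values chosen only so the arithmetic is easy to follow; a real deployment uses cryptographically large primes and carefully generated group elements.

```python
# Pedersen commitment sketch over a small prime-order subgroup.
# Toy parameters for illustration only -- NOT secure.
p = 1019   # prime modulus (hypothetical toy value)
q = 509    # prime order of the subgroup: q divides p - 1
g = 4      # two generators of the order-q subgroup (toy choices)
h = 9
assert pow(g, q, p) == 1 and pow(h, q, p) == 1  # sanity: both in the subgroup

def commit(value, r):
    """Commit to `value` with randomness `r`: c = g^value * h^r mod p."""
    return (pow(g, value, p) * pow(h, r, p)) % p

# Bob commits to his attribute value. Without r, the commitment reveals
# nothing about the value (hiding); Bob cannot later open it to a
# different value (binding, under the discrete-log assumption).
c = commit(25, 123)
```

Pedersen commitments are additively homomorphic, which is part of what makes predicate protocols over committed values workable: multiplying two commitments yields a commitment to the sum of the values.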
Differentially Private Projected Histograms of Multi-Attribute Data for Classification
In this paper, we tackle the problem of constructing a differentially private
synopsis for classification analyses. Several state-of-the-art methods follow
the structure of existing classification algorithms and are all iterative,
which is suboptimal due to the locally optimal choices and the privacy budget
being divided too thinly among many sequentially composed steps. Instead, we
propose PrivPfC, a new differentially private method for releasing data for
classification. The key idea is to privately select an
optimal partition of the underlying dataset using the given privacy budget in
one step. Given one dataset and the privacy budget, PrivPfC constructs a pool
of candidate grids where the number of cells of each grid is under a data-aware
and privacy-budget-aware threshold. After that, PrivPfC selects an optimal grid
via the exponential mechanism by using a novel quality function which minimizes
the expected number of misclassified records on which a histogram classifier is
constructed using the published grid. Finally, PrivPfC injects noise into each
cell of the selected grid and releases the noisy grid as the private synopsis
of the data. If the size of the candidate grid pool is larger than the
processing capability threshold set by the data curator, we add a step at the
beginning of PrivPfC to privately prune the set of attributes. We introduce a
modified quality function with low sensitivity and use it to evaluate
an attribute's relevance to the classification label variable. Through
extensive experiments on real datasets, we demonstrate PrivPfC's superiority
over the state-of-the-art methods.
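The private grid-selection step relies on the exponential mechanism, which picks a candidate with probability proportional to an exponential of its quality score. The sketch below shows the generic mechanism; the quality function is a placeholder argument, not PrivPfC's misclassification-based one.

```python
import math
import random

def exponential_mechanism(candidates, quality, epsilon, sensitivity):
    """Select one candidate with probability proportional to
    exp(epsilon * quality(c) / (2 * sensitivity))."""
    scores = [epsilon * quality(c) / (2.0 * sensitivity) for c in candidates]
    m = max(scores)  # subtract the max before exponentiating, for stability
    weights = [math.exp(s - m) for s in scores]
    total = sum(weights)
    r = random.uniform(0, total)
    acc = 0.0
    for c, w in zip(candidates, weights):
        acc += w
        if r <= acc:
            return c
    return candidates[-1]
```

In PrivPfC's setting, the candidates would be the pool of grids and the quality function the (negated) expected number of misclassified records; both are supplied by the algorithm, not by this generic sketch.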
Differentially Private Grids for Geospatial Data
In this paper, we tackle the problem of constructing a differentially private
synopsis for two-dimensional datasets such as geospatial datasets. The current
state-of-the-art methods work by performing recursive binary partitioning of
the data domains, and constructing a hierarchy of partitions. We show that the
key challenge in partition-based synopsis methods lies in choosing the right
partition granularity to balance the noise error and the non-uniformity error.
We study the uniform-grid approach, which applies an equi-width grid of a
certain size over the data domain and then issues independent count queries on
the grid cells. This method has received no attention in the literature,
probably because no good method for choosing a grid size was known. Based on
an analysis of the two kinds of errors, we propose a method for
choosing the grid size. Experimental results validate our method, and show that
this approach performs as well as, and oftentimes better than, the
state-of-the-art methods. We further introduce a novel adaptive-grid method.
The adaptive-grid method lays a coarse-grained grid over the dataset, and then
further partitions each cell according to its noisy count. Both levels of
partitions are then used in answering queries over the dataset. This method
addresses the need for finer-granularity partitioning over dense regions and,
at the same time, coarser partitioning over sparse regions. Through
extensive experiments on real-world datasets, we show that this approach
consistently and significantly outperforms the uniform-grid method and other
state-of-the-art methods.
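The uniform-grid approach is easy to sketch: lay an m x m grid over the domain and add Laplace noise to each cell count. The grid-size rule below, m ≈ sqrt(N·ε/c) with c around 10, is my reading of the kind of guideline the error analysis yields; treat the exact constant as an assumption of this sketch.

```python
import math
import random

def laplace(scale):
    """One sample from Laplace(0, scale), via the inverse CDF."""
    u = random.random() - 0.5
    return -scale * math.copysign(math.log(1 - 2 * abs(u)), u)

def uniform_grid(points, m, epsilon):
    """Partition [0,1)^2 into an m x m grid and release noisy counts.
    Each point falls in exactly one cell, so the sensitivity is 1."""
    counts = [[0] * m for _ in range(m)]
    for x, y in points:
        counts[min(int(x * m), m - 1)][min(int(y * m), m - 1)] += 1
    return [[c + laplace(1.0 / epsilon) for c in row] for row in counts]

def grid_size(n, epsilon, c=10.0):
    """Heuristic grid size m ~ sqrt(n * epsilon / c), balancing noise
    error (grows with m) against non-uniformity error (shrinks with m)."""
    return max(1, round(math.sqrt(n * epsilon / c)))
```

The intuition matches the abstract: more cells mean more injected noise per query, while fewer cells mean each cell averages over a less uniform region.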
Locally Differentially Private Heavy Hitter Identification
The notion of Local Differential Privacy (LDP) enables users to answer
sensitive questions while preserving their privacy. The basic LDP frequency
oracle protocol enables the aggregator to estimate the frequency of any value.
But when the domain of input values is large, finding the most frequent values,
also known as the heavy hitters, by estimating the frequencies of all possible
values, is computationally infeasible. In this paper, we propose an LDP
protocol for identifying heavy hitters. In our proposed protocol, which we call
the Prefix Extending Method (PEM), users are divided into groups, with each user
reporting a prefix of his or her value. We analyze how to choose optimal parameters
for the protocol and identify two design principles for designing LDP protocols
with high utility. Experiments on both synthetic and real-world datasets
demonstrate the advantage of our proposed protocol.
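The prefix-extension search at the heart of PEM can be sketched as follows. For brevity, exact counts stand in for the LDP frequency oracle, so this shows only the search strategy over prefixes, not the privacy mechanism or the group assignment.

```python
from collections import Counter

def pem_skeleton(values, bit_len, k):
    """Prefix-extension search: grow candidate prefixes one bit at a
    time, keeping the k most frequent at each step. `values` are
    bit strings of length `bit_len`; exact counts replace the LDP
    frequency oracle in this non-private sketch."""
    candidates = ['']
    for _ in range(bit_len):
        extended = [p + b for p in candidates for b in '01']
        freq = Counter()
        for v in values:
            for p in extended:
                if v.startswith(p):
                    freq[p] += 1
        candidates = [p for p, _ in freq.most_common(k)]
    return candidates
```

Because only k prefixes survive each round, the search touches O(k · bit_len) candidates instead of the full (exponentially large) domain, which is what makes heavy-hitter identification tractable.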
Slicing: A New Approach to Privacy Preserving Data Publishing
Several anonymization techniques, such as generalization and bucketization,
have been designed for privacy preserving microdata publishing. Recent work has
shown that generalization loses a considerable amount of information,
especially for high-dimensional data. Bucketization, on the other hand, does
not prevent membership disclosure and does not apply to data that lack a clear
separation between quasi-identifying attributes and sensitive attributes.
In this paper, we present a novel technique called slicing, which partitions
the data both horizontally and vertically. We show that slicing preserves
better data utility than generalization and can be used for membership
disclosure protection. Another important advantage of slicing is that it can
handle high-dimensional data. We show how slicing can be used for attribute
disclosure protection and develop an efficient algorithm for computing the
sliced data that obey the l-diversity requirement. Our workload experiments
confirm that slicing preserves better utility than generalization and is more
effective than bucketization in workloads involving the sensitive attribute.
Our experiments also demonstrate that slicing can be used to prevent membership
disclosure.
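The core slicing transformation is easy to sketch: group attributes into columns (vertical partition), group tuples into buckets (horizontal partition), then randomly permute each column group within each bucket so cross-column linkage is broken. The attribute groupings and bucket size are inputs here; the actual algorithm chooses them (e.g., to satisfy l-diversity).

```python
import random

def slice_table(rows, column_groups, bucket_size):
    """Slicing sketch: vertically partition attributes into column
    groups, horizontally partition tuples into buckets, then shuffle
    each column group independently within each bucket."""
    sliced = []
    for start in range(0, len(rows), bucket_size):
        bucket = rows[start:start + bucket_size]
        pieces = []
        for group in column_groups:               # e.g. [(0, 1), (2,)]
            vals = [tuple(r[i] for i in group) for r in bucket]
            random.shuffle(vals)                  # break linkage
            pieces.append(vals)
        # reassemble: row j of the bucket holds the j-th shuffled piece
        # from every column group
        sliced.extend(tuple(p[j] for p in pieces) for j in range(len(bucket)))
    return sliced
```

Within a bucket, the multiset of values in each column group is preserved exactly (good for utility), while the association between column groups is randomized (good for privacy).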
A framework for role-based access control in group communication systems
In addition to basic security services such as confidentiality, integrity, and data source authentication, a secure group communication system should also provide authentication of participants and access control to group resources. While considerable research has been conducted on providing confidentiality and integrity for group communication, less work has focused on group access control services. In the context of group communication, specifying and enforcing access control becomes more challenging because of the dynamic and distributed nature of groups and because of fault tolerance issues (i.e., withstanding process faults and network partitions). In this paper we analyze the requirements that access control mechanisms must fulfill in the context of group communication and define a framework for supporting fine-grained access control in client-server group communication systems. Our framework combines role-based access control mechanisms with environment parameters (time, IP address, etc.) to provide policy support for a wide range of applications with very different requirements. While policy is defined by the application, its efficient enforcement is provided by the group communication system.
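The idea of combining roles with environment parameters can be illustrated with a toy policy check. The policy structure, role and action names below are hypothetical, not the framework's actual policy language.

```python
from datetime import time

# Hypothetical policy: a (role, action) pair is allowed only when the
# environment parameters (time of day, client IP prefix) match.
POLICY = {
    ('moderator', 'send_control_msg'): {
        'hours': (time(8, 0), time(18, 0)),
        'ip_prefix': '10.1.',
    },
}

def allowed(role, action, now, client_ip):
    """Grant the action only if the role holds the permission AND the
    environment constraints are satisfied."""
    rule = POLICY.get((role, action))
    if rule is None:
        return False
    start, end = rule['hours']
    return start <= now <= end and client_ip.startswith(rule['ip_prefix'])
```

Keeping the environment checks inside the enforcement layer (here, the `allowed` function) mirrors the paper's split: the application defines the policy table, while the group communication system evaluates it.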
Optimizing Locally Differentially Private Protocols
Protocols satisfying Local Differential Privacy (LDP) enable parties to
collect aggregate information about a population while protecting each user's
privacy, without relying on a trusted third party. LDP protocols (such as
Google's RAPPOR) have been deployed in real-world scenarios. In these
protocols, a user encodes his private information and perturbs the encoded
value locally before sending it to an aggregator, who combines values that
users contribute to infer statistics about the population. In this paper, we
introduce a framework that generalizes several LDP protocols proposed in the
literature. Our framework yields a simple and fast aggregation algorithm, whose
accuracy can be precisely analyzed. Our in-depth analysis enables us to choose
optimal parameters, resulting in two new protocols (i.e., Optimized Unary
Encoding and Optimized Local Hashing) that provide better utility than
protocols previously proposed. We present precise conditions for when each
proposed protocol should be used, and perform experiments that demonstrate the
advantage of our proposed protocols.
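Optimized Unary Encoding, one of the two protocols named above, is simple to sketch: each user one-hot encodes a value, randomizes the bits asymmetrically, and the aggregator corrects for the randomization. The parameter choice p = 1/2, q = 1/(e^ε + 1) is the optimized setting as I understand it; treat the details as a sketch rather than the paper's exact specification.

```python
import math
import random

def oue_perturb(value, domain_size, epsilon):
    """Optimized Unary Encoding: one-hot encode `value`, then report
    each bit independently. A 1-bit stays 1 with p = 1/2; a 0-bit flips
    to 1 with q = 1 / (e^eps + 1)."""
    q = 1.0 / (math.exp(epsilon) + 1.0)
    return [
        (random.random() < 0.5) if i == value else (random.random() < q)
        for i in range(domain_size)
    ]

def oue_estimate(reports, epsilon):
    """Unbiased frequency estimates from the users' reported bit
    vectors: subtract the expected noise, rescale by p - q."""
    n = len(reports)
    p, q = 0.5, 1.0 / (math.exp(epsilon) + 1.0)
    return [
        (sum(r[i] for r in reports) - n * q) / (p - q)
        for i in range(len(reports[0]))
    ]
```

The asymmetry (keeping 1-bits with only probability 1/2 while strongly suppressing false 1s) is what lowers the estimator's variance relative to symmetric randomized response on each bit.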
Locally Differentially Private Frequency Estimation with Consistency
Local Differential Privacy (LDP) protects user privacy from the data
collector. LDP protocols have been increasingly deployed in the industry. A
basic building block is frequency oracle (FO) protocols, which estimate
frequencies of values. While several FO protocols have been proposed, their
design goals do not lead to optimal results for answering many queries. In
this paper, we show that adding post-processing steps to FO protocols by
exploiting the knowledge that all individual frequencies should be non-negative
and they sum up to one can lead to significantly better accuracy for a wide
range of tasks, including frequencies of individual values, frequencies of the
most frequent values, and frequencies of subsets of values. We consider 10
different methods that exploit this knowledge differently. We establish
theoretical relationships between some of them and conduct extensive
experimental evaluations to understand which methods should be used for
different query tasks.
Comment: NDSS 202
Relational Database Systems
In this paper, we present a comprehensive approach for privacy preserving access control based on the notion of purpose. Purpose information associated with a given data element specifies the intended use of the data element, and our model allows multiple purposes to be associated with each data element. A key feature of our model is that it also supports explicit prohibitions, thus allowing privacy officers to specify that some data should not be used for certain purposes. Another important issue addressed in this paper is the granularity of data labeling, that is, the units of data with which purposes can be associated. We address this issue in the context of relational databases and propose four different labeling schemes, each providing a different granularity. In the paper we also propose an approach to representing purpose information, which results in very low storage overhead, and we exploit query modification techniques to support data access control based on purpose information.
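The query-modification idea can be sketched as a rewrite that appends a purpose predicate to the incoming query. The column names and the string-based rewriting below are illustrative only; a real implementation would operate on the parsed query and the paper's labeling schemes rather than on raw SQL text.

```python
def modify_query(base_query, access_purpose):
    """Query-modification sketch: restrict results to tuples whose
    (hypothetical) purpose-label columns permit `access_purpose` and
    do not explicitly prohibit it."""
    predicate = (
        f"allowed_purposes LIKE '%{access_purpose}%' "
        f"AND prohibited_purposes NOT LIKE '%{access_purpose}%'"
    )
    joiner = " AND " if " WHERE " in base_query.upper() else " WHERE "
    return base_query + joiner + predicate
```

Note the two-sided check: a tuple is returned only when the access purpose is among its intended purposes AND absent from its explicit prohibitions, mirroring the model's support for prohibitions.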
Membership Inference Attacks and Defenses in Classification Models
We study the membership inference (MI) attack against classifiers, where the
attacker's goal is to determine whether a data instance was used for training
the classifier. Through systematic cataloging of existing MI attacks and
extensive experimental evaluations of them, we find that a model's
vulnerability to MI attacks is tightly related to the generalization gap -- the
difference between training accuracy and test accuracy. We then propose a
defense against MI attacks that aims to close the gap by intentionally
reducing the training accuracy. More specifically, the training process attempts to
match the training and validation accuracies, by means of a new {\em set
regularizer} using the Maximum Mean Discrepancy between the softmax output
empirical distributions of the training and validation sets. Our experimental
results show that combining this approach with another simple defense (mix-up
training) significantly improves state-of-the-art defense against MI attacks,
with minimal impact on testing accuracy.
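The set regularizer's core quantity, the Maximum Mean Discrepancy between two sets of softmax outputs, has a standard kernel-based estimator. The sketch below uses a Gaussian kernel over plain Python lists; the paper's kernel choice and the way the term enters the loss may differ.

```python
import math

def gaussian_kernel(u, v, sigma=1.0):
    """Gaussian (RBF) kernel between two equal-length vectors."""
    d2 = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-d2 / (2 * sigma ** 2))

def mmd2(xs, ys, sigma=1.0):
    """Biased estimate of squared Maximum Mean Discrepancy between two
    samples, e.g. softmax output vectors of the training set (xs) and
    the validation set (ys)."""
    kxx = sum(gaussian_kernel(a, b, sigma) for a in xs for b in xs) / len(xs) ** 2
    kyy = sum(gaussian_kernel(a, b, sigma) for a in ys for b in ys) / len(ys) ** 2
    kxy = sum(gaussian_kernel(a, b, sigma) for a in xs for b in ys) / (len(xs) * len(ys))
    return kxx + kyy - 2 * kxy
```

Adding this quantity to the training loss penalizes the model whenever its output distribution on training data drifts away from its output distribution on held-out data, which is exactly the generalization gap the defense targets.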